Semantic Cosine Similarity

Authors

  • Faisal Rahutomo
  • Teruaki Kitasuka
  • Masayoshi Aritsugi

Abstract

Cosine similarity is a widely used metric in information retrieval and related fields. It models a text as a vector of terms, and the similarity between two texts is the cosine of the angle between their term vectors. Cosine similarity, however, cannot fully capture the semantic meaning of a text. This paper proposes an enhancement of the cosine similarity measure that incorporates semantic checking between the dimensions of two term vectors. The strategy aims to increase the similarity value between two term vectors whose dimensions are semantically related but syntactically different. Experimental results show that the proposal is promising.
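The abstract does not spell out the paper's algorithm, so the following is only a minimal sketch of the idea it describes, not the authors' method. It computes plain cosine similarity over bag-of-words term vectors and then shows one simple way a semantic check could raise the score: mapping synonymous terms onto a shared dimension before comparing. The helper names (`term_vector`, `cosine_similarity`) and the tiny `TOY_SYNONYMS` table are assumptions made for illustration; a real system would presumably draw on a lexical resource instead.

```python
import math
from collections import Counter


def term_vector(text, canon=None):
    """Build a term-frequency vector from whitespace-separated tokens.

    `canon` is an optional (hypothetical) mapping from a term to a
    canonical synonym, used to merge semantically related dimensions.
    """
    canon = canon or {}
    return Counter(canon.get(tok, tok) for tok in text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


# Toy synonym table, invented for this example only.
TOY_SYNONYMS = {"automobile": "car", "quick": "fast"}

doc1 = "the car is fast"
doc2 = "the automobile is quick"

# Plain cosine: only "the" and "is" overlap, so the synonyms add nothing.
plain = cosine_similarity(term_vector(doc1), term_vector(doc2))

# With synonyms merged onto shared dimensions, the vectors coincide.
merged = cosine_similarity(
    term_vector(doc1, TOY_SYNONYMS), term_vector(doc2, TOY_SYNONYMS)
)

print(f"plain:  {plain:.2f}")   # 0.50
print(f"merged: {merged:.2f}")  # 1.00
```

In this toy case the two sentences mean the same thing, yet plain cosine similarity scores them at 0.5 because "car"/"automobile" and "fast"/"quick" occupy different dimensions. Merging those dimensions is the crudest form of semantic checking; a graded relatedness score between dimensions would be a natural refinement.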

Related resources

Are Cited References Meaningful? Measuring Semantic Relatedness in Citation Analysis

In this proof-of-concept study we use the standard cosine similarity measure to calculate the semantic similarity between two pieces of text: the citing document and the cited text. Three subject matter experts then evaluate the citing and the cited text, based on the cosine score, to give their judgement on the semantic similarity between the two pieces of text.

Comparison of cosine similarity and k-NN for automated essays scoring

This paper presents a comparison between cosine similarity and the k-nearest neighbors algorithm within the Latent Semantic Analysis method for automatically scoring Arabic essays. It also improves Latent Semantic Analysis by preprocessing the input text: unifying letter forms, removing formatting, replacing synonyms, stemming, and deleting stop words. The results showed that the use of Cos...

DIT: Summarisation and Semantic Expansion in Evaluating Semantic Similarity

This paper describes an approach to implementing a tool for evaluating semantic similarity. We investigated the potential benefits of (1) using text summarisation to narrow the comparison down to the most important concepts in both texts, and (2) leveraging WordNet information to increase the usefulness of cosine comparisons of short texts. In our experiments, text summarisation using a graph-based...

A Similarity-based Probability Model for Latent Semantic Indexing

A dual probability model is constructed for Latent Semantic Indexing (LSI) using the cosine similarity measure. Both the document-document similarity matrix and the term-term similarity matrix arise naturally from the maximum likelihood estimation of the model parameters, and the optimal solutions are the latent semantic vectors of LSI. Dimensionality reduction is justified by the statist...

Textual Spatial Cosine Similarity

Many methods exist today for measuring document similarity, such as cosine similarity. More complex methods based on semantic analysis of textual information are also available, but they are computationally expensive and rarely used for real-time content feeding, as in enterprise-wide search environments. To address these real-time constraints, we developed a new measure of document simil...

Publication date: 2012